Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding overloading functionality for add, sub and neg #173

Merged
merged 7 commits into from
May 21, 2021

Conversation

adehecq
Copy link
Member

@adehecq adehecq commented May 20, 2021

With this PR, it will now be possible to add, subtract and calculate the opposite of a Raster object.
Here is a quick example:

import geoutils as gu
r1 = gu.Raster(gu.datasets.get_path('landsat_B4'))
r2 = gu.Raster(gu.datasets.get_path('landsat_RGB'), bands=1)
r3 = r1 + r2
r3 = -r1
r3 = r1 - r2

This also works with numpy arrays or single numbers, e.g.:

import numpy as np
r3 = r1 + np.ones(r1.data.shape)
r3 = r1 + 5

Question:
One thing I have completely ignored for now is data type. If the two rasters are 'uint8' like in the example, the output will also be of type uint8. This will lead to weird results for values > 255 (and similarly for other types with larger values).
Should a condition be added to check that the output value is within the dtype range?
Should a warning be raised? Should the output data type be set accordingly?
What do you think @erikmannerfelt @atedstone ?

To Do:
Add some tests.

@adehecq adehecq added the enhancement Feature improvement or request label May 20, 2021
@adehecq adehecq self-assigned this May 20, 2021
@adehecq
Copy link
Member Author

adehecq commented May 20, 2021

For whatever reason I accidentally included the commit from my previous PR...

@erikmannerfelt
Copy link
Contributor

This is looking great!!

  1. Regarding the dtypes, I think it should be done like numpy. I don't know how numpy does it haha but there must be somewhere in the documentation that says it.
  2. Related to dtypes, maybe an .astype() or similar method should exist to help with these arithmetics?
  3. Please also add a r1 / r2 operator if possible!
  4. What happens if one raster has an incompatible amount of bands?

@adehecq
Copy link
Member Author

adehecq commented May 20, 2021

This is looking great!!

1. Regarding the dtypes, I think it should be done like numpy. I don't know how numpy does it haha but there must be somewhere in the documentation that says it.

Well, this is already following numpy's way. Under the hood it's simply an operation between numpy arrays so the output type of the array is set by numpy and then the Raster dtype is set from the numpy array. In my example with the numpy array for example, the output Raster has a type float64.

2. Related to dtypes, maybe an `.astype()` or similar method should exist to help with these arithmetics?

Mmmh, interesting idea! We sort of have this functionality already with the 'do-it-all' reproject function, which can convert dtype, except that right now if only the data type changes it won't do anything. So it could be a separate method yes.

3. Please also add a `r1 / r2` operator if possible!

I will! And r1 * r2.

4. What happens if one raster has an incompatible amount of bands?

If data.shape is not the same, an error is raised. So for now, both Rasters need to have the same amount of bands loaded into memory. We could also decide to use the first band by default, but it's so easy to load only a single or part of the bands with self.load that I would rather leave it as it is.

@erikmannerfelt
Copy link
Contributor

This is looking great!!

1. Regarding the dtypes, I think it should be done like numpy. I don't know how numpy does it haha but there must be somewhere in the documentation that says it.

Well, this is already following numpy's way. Under the hood it's simply an operation between numpy arrays so the output type of the array is set by numpy and then the Raster dtype is set from the numpy array. In my example with the numpy array for example, the output Raster has a type float64.

2. Related to dtypes, maybe an `.astype()` or similar method should exist to help with these arithmetics?

Mmmh, interesting idea! We sort of have this functionality already with the 'do-it-all' reproject function, which can convert dtype, except that right now if only the data type changes it won't do anything. So it could be a separate method yes.

3. Please also add a `r1 / r2` operator if possible!

I will! And r1 * r2.

4. What happens if one raster has an incompatible amount of bands?

If data.shape is not the same, an error is raised. So for now, both Rasters need to have the same amount of bands loaded into memory. We could also decide to use the first band by default, but it's so easy to load only a single or part of the bands with self.load that I would rather leave it as it is.

Sounds good with the dtype handling and band errors!

An .astype(whatever) method could just be added that runs .from_array with self._data.astype(whatever)

@adehecq
Copy link
Member Author

adehecq commented May 20, 2021

Actually, the question of dtypes is a problem for the subtraction...
Here's a minimal example that fails when subtracting a raster with dtypes float32 and uint8:

width = height = 5
transform = rio.transform.from_bounds(0, 0, 1, 1, width, height)
r1 = gu.Raster.from_array(np.random.randint(0, 255, (height, width)).astype('float32'), transform=transform, crs=None)
r2 = gu.Raster.from_array(np.random.randint(0, 255, (height, width), dtype='uint8'), transform=transform, crs=None)
r3 = r1 - r2
assert np.all(r3.data == r1.data - r2.data)

That's because running r1 - r2 will actually run r1 + -r2, and -r2 will be operated under the assumption of a uint8 type.

@erikmannerfelt
Copy link
Contributor

Why does it fail? Because -r2 doesn't work with a uint8 dtype? How does that work for uint8 ndarrays?

@adehecq
Copy link
Member Author

adehecq commented May 20, 2021

Why does it fail? Because -r2 doesn't work with a uint8 dtype? How does that work for uint8 ndarrays?

Because of this:
np.float32(10) - np.uint8(5) -> 5.0
np.float32(10) + -np.uint8(5) -> 261.0

@adehecq
Copy link
Member Author

adehecq commented May 20, 2021

It's actually a simple fix, I just need to make sure that the second argument has a compatible type with self, before doing the negation.

@erikmannerfelt
Copy link
Contributor

Nice!! Does * and / work or is this still in the works?

@adehecq
Copy link
Member Author

adehecq commented May 20, 2021

Nice!! Does * and / work or is this still in the works?

Still in the work. I'm working on the tests for the new methods right now.

@adehecq
Copy link
Member Author

adehecq commented May 21, 2021

@erikmannerfelt, if that version is ok, should we accept this PR and I'll create the multiplication/division in a new PR?

Copy link
Contributor

@erikmannerfelt erikmannerfelt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make a new PR for the / and * operators!

@adehecq adehecq merged commit 5f2e317 into GlacioHack:master May 21, 2021
@adehecq adehecq deleted the overloading branch May 27, 2021 14:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Feature improvement or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants